Abstract. This paper presents efficient algorithms for testing the finite, polynomial,
and exponential ambiguity of finite automata with ε-transitions. It gives an algorithm
for testing the exponential ambiguity of an automaton $A$ in time $O(|A|_E^2)$, and
finite or polynomial ambiguity in time $O(|A|_E^3)$. These complexities significantly
improve over the previous best complexities given for the same problem. Furthermore,
the algorithms presented are simple and are based on a general algorithm for the
composition or intersection of automata. We also give an algorithm to determine the
degree of polynomial ambiguity of a finite automaton $A$ that is polynomially ambiguous
in time $O(|A|_E^3)$. Finally, we present an application of our algorithms to an
approximate computation of the entropy of a probabilistic automaton.



1 Introduction 

The question of the ambiguity of finite automata arises in a variety of contexts. In some 
cases, the application of an algorithm requires an input automaton to be finitely ambigu- 
ous, in others the convergence of a bound or guarantee relies on that finite ambiguity or 
the asymptotic rate of the increase of ambiguity as a function of the string length. Thus, 
in all these cases, one needs an algorithm to test the ambiguity, either to determine if it 
is finite, or to estimate its asymptotic rate of increase. 

The problem of testing ambiguity has been extensively analyzed in the past. The
problem of determining the degree of ambiguity of an automaton with finite ambiguity
was shown to be PSPACE-complete. However, testing finite ambiguity can be done in
polynomial time using a characterization of polynomial and exponential ambiguity
given by [6, 5, 9, 4, 11]. The most efficient algorithms for testing polynomial and
exponential ambiguity, and thereby testing finite ambiguity, were presented by [10, 12].
The algorithms presented in [12] assume the input automaton to be ε-free, but they are
extended to the case where the automaton has ε-transitions in [10]. In the presence of
ε-transitions, the complexity of the algorithms given by [10] is
$O((|A|_E + |A|_Q^2)^2)$ for testing the exponential ambiguity of an automaton $A$ and
$O((|A|_E + |A|_Q^2)^3)$ for testing polynomial ambiguity, where $|A|_E$ stands for the
number of transitions and $|A|_Q$ the number of states of $A$.

This paper presents significantly more efficient algorithms for testing finite,
polynomial, and exponential ambiguity for the general case of automata with
ε-transitions. It gives an algorithm for testing the exponential ambiguity of an
automaton $A$ in time $O(|A|_E^2)$, and finite or polynomial ambiguity in time
$O(|A|_E^3)$. The main idea behind our algorithms is to make use of the composition or
intersection of finite automata with ε-transitions [8, 7]. The ε-filter used in these
algorithms crucially helps in the analysis and test of the ambiguity. We also give an
algorithm to determine the degree of polynomial ambiguity of a finite automaton $A$
that is polynomially ambiguous in time $O(|A|_E^3)$. Finally, we present an application
of our algorithms to an approximate computation of the entropy of a probabilistic
automaton.

The remainder of the paper is organized as follows. Section 2 presents general
automata and ambiguity definitions. In Section 3 we give a brief description of
existing characterizations for the ambiguity of automata and extend them to the case
of automata with ε-transitions. In Section 4 we present our algorithms for testing
finite, polynomial, and exponential ambiguity, and the proof of their correctness.
Section 5 details the relevance of these algorithms to the approximation of the
entropy of probabilistic automata.

2 Preliminaries 

Definition 1. A finite automaton $A$ is a 5-tuple $(\Sigma, Q, E, I, F)$ where:
$\Sigma$ is a finite alphabet; $Q$ is a finite set of states; $I \subseteq Q$ the set
of initial states; $F \subseteq Q$ the set of final states; and
$E \subseteq Q \times (\Sigma \cup \{\varepsilon\}) \times Q$ a finite set of
transitions, where $\varepsilon$ denotes the empty string.

We denote by $|A|_Q$ the number of states, by $|A|_E$ the number of transitions and by
$|A| = |A|_E + |A|_Q$ the size of an automaton $A$. Given a state $q \in Q$, $E[q]$
denotes the set of transitions leaving $q$. For two subsets $R \subseteq Q$ and
$R' \subseteq Q$, we denote by $P(R, x, R')$ the set of all paths from a state
$q \in R$ to a state $q' \in R'$ labeled with $x \in \Sigma^*$. We also denote by
$p[\pi]$ the origin state, by $n[\pi]$ the destination state, and by
$i[\pi] \in \Sigma^*$ the label of a path $\pi$.

A string $x \in \Sigma^*$ is accepted by $A$ if it labels a successful path, i.e. a
path from an initial state to a final state. A finite automaton $A$ is trim if every
state of $A$ belongs to a successful path. $A$ is unambiguous if for any string
$x \in \Sigma^*$ there is at most one successful path labeled by $x$ in $A$;
otherwise, $A$ is said to be ambiguous. The degree of ambiguity of a string $x$ in
$A$, denoted by $\mathrm{da}(A, x)$, is the number of successful paths in $A$ labeled
by $x$. Note that if $A$ contains an ε-cycle, there exists $x \in \Sigma^*$ such that
$\mathrm{da}(A, x) = \infty$. Using a depth-first search restricted to ε-transitions,
it can be decided in linear time whether $A$ has ε-cycles. Thus, in the following, we
will assume without loss of generality that $A$ is ε-cycle free.
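The ε-cycle check mentioned above can be sketched as follows. This is only an
illustrative sketch, not the authors' implementation: the representation of an
automaton as a list of (source, label, destination) triples, the use of `None` as
the ε label, and the name `has_epsilon_cycle` are all assumptions made for the
example.

```python
from collections import defaultdict

EPS = None  # assumed marker for the empty-string label


def has_epsilon_cycle(transitions):
    """Return True iff the epsilon-transitions alone contain a cycle.

    transitions: iterable of (source, label, destination) triples.
    """
    eps_succ = defaultdict(list)
    states = set()
    for src, label, dst in transitions:
        states.update((src, dst))
        if label is EPS:
            eps_succ[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {q: WHITE for q in states}
    for root in states:
        if color[root] != WHITE:
            continue
        color[root] = GREY
        stack = [(root, iter(eps_succ[root]))]
        while stack:
            state, successors = stack[-1]
            for nxt in successors:
                if color[nxt] == GREY:      # back edge: an epsilon-cycle
                    return True
                if color[nxt] == WHITE:
                    color[nxt] = GREY
                    stack.append((nxt, iter(eps_succ[nxt])))
                    break
            else:                           # all successors explored
                color[state] = BLACK
                stack.pop()
    return False
```

Each state and each ε-transition is visited a constant number of times, which is
consistent with the linear-time bound stated above.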

The degree of ambiguity of $A$ is defined as
$\mathrm{da}(A) = \sup_{x \in \Sigma^*} \mathrm{da}(A, x)$. $A$ is said to be finitely
ambiguous if $\mathrm{da}(A) < \infty$ and infinitely ambiguous if
$\mathrm{da}(A) = \infty$. $A$ is said to be polynomially ambiguous if there exists a
polynomial $h$ in $\mathbb{N}[X]$ such that $\mathrm{da}(A, x) \le h(|x|)$ for all
$x \in \Sigma^*$. The minimal degree of such a polynomial is called the degree of
polynomial ambiguity of $A$, denoted by $\mathrm{dpa}(A)$. By definition,
$\mathrm{dpa}(A) = 0$ iff $A$ is finitely ambiguous. When $A$ is infinitely ambiguous
but not polynomially ambiguous, we say that $A$ is exponentially ambiguous and that
$\mathrm{dpa}(A) = \infty$.

Fig. 1. Illustration of the (a) (EDA), (b) (IDA) and (c) (IDA$_d$) properties.
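To make the definitions of $\mathrm{da}(A, x)$ and of the growth classes concrete,
the following sketch counts successful paths in an ε-free automaton by dynamic
programming over the input string. The triple-based representation, the function
name `count_successful_paths`, and the two toy automata `LINEAR` and `EXPONENTIAL`
are assumptions introduced for illustration only.

```python
from collections import defaultdict


def count_successful_paths(transitions, initial, final, x):
    """Number of successful paths labeled x in an epsilon-free automaton."""
    by_label = defaultdict(list)
    for src, label, dst in transitions:
        by_label[label].append((src, dst))
    # counts[q] = number of paths from an initial state to q reading the prefix
    counts = {q: 1 for q in initial}
    for symbol in x:
        nxt = defaultdict(int)
        for src, dst in by_label[symbol]:
            if src in counts:
                nxt[dst] += counts[src]
        counts = nxt
    return sum(c for q, c in counts.items() if q in final)


# Toy automaton with da(A, a^n) = n (initial state 0, final state 1).
LINEAR = [(0, "a", 0), (0, "a", 1), (1, "a", 1)]
# Toy automaton with da(A, (aa)^n) = 2^n (state 0 is initial and final).
EXPONENTIAL = [(0, "a", 1), (1, "a", 0), (0, "a", 2), (2, "a", 0)]
```

For instance, `count_successful_paths(LINEAR, {0}, {1}, "aaa")` counts the three
positions at which a path may switch from state 0 to state 1.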

3 Characterization of infinite ambiguity 

The characterization and test of finite, polynomial, and exponential ambiguity of
finite automata without ε-transitions are based on the following fundamental
properties [6, 5, 9, 4, 11, 10, 12].

Definition 2. The following are three key properties for the characterization of the
ambiguity of an automaton $A$.

(a) (EDA): There exists a state $q$ with at least two distinct cycles labeled by some
$v \in \Sigma^*$ (Figure 1(a)).

(b) (IDA): There exist two distinct states $p$ and $q$ with paths labeled with $v$
from $p$ to $p$, $p$ to $q$, and $q$ to $q$, for some $v \in \Sigma^*$ (Figure 1(b)).

(c) (IDA$_d$): There exist $2d$ states $p_1, \ldots, p_d, q_1, \ldots, q_d$ in $A$ and
$2d - 1$ strings $v_1, \ldots, v_d$ and $u_2, \ldots, u_d$ in $\Sigma^*$ such that for
all $1 \le i \le d$, $p_i \ne q_i$ and $P(p_i, v_i, p_i)$, $P(p_i, v_i, q_i)$ and
$P(q_i, v_i, q_i)$ are non-empty, and for all $2 \le i \le d$,
$P(q_{i-1}, u_i, p_i)$ is non-empty (Figure 1(c)).

Observe that (EDA) implies (IDA). Assuming (EDA), let $e$ and $e'$ be the first
transitions that differ in the two cycles at state $q$; then we must have
$n[e] \ne n[e']$ since Definition 1 disallows multiple transitions between the same
two states with the same label. Thus, (IDA) holds for the pair $(n[e], n[e'])$.

In the ε-free case, it was shown that a trim automaton $A$ satisfies (IDA) iff $A$ is
infinitely ambiguous [11, 12], that $A$ satisfies (EDA) iff $A$ is exponentially
ambiguous [4], and that $A$ satisfies (IDA$_d$) iff $\mathrm{dpa}(A) \ge d$ [10, 12].
These characterizations can be straightforwardly extended to the case of automata with
ε-transitions in the following proposition.

Proposition 1. Let $A$ be a trim ε-cycle free finite automaton.

(i) $A$ is infinitely ambiguous iff $A$ satisfies (IDA).
(ii) $A$ is exponentially ambiguous iff $A$ satisfies (EDA).
(iii) $\mathrm{dpa}(A) \ge d$ iff $A$ satisfies (IDA$_d$).

Proof. The proof is by induction on the number of ε-transitions in $A$. If $A$ does
not have any ε-transitions, then the proposition holds as shown in [11, 12] for (i),
[4] for (ii) and [12] for (iii).

Assume now that $A$ has $n + 1$ ε-transitions, $n \ge 0$, and that the statement of
the proposition holds for all automata with $n$ ε-transitions. Select an ε-transition
$e_0$ in $A$, and let $A'$ be the finite automaton obtained after application of
ε-removal to $A$ limited to transition $e_0$. $A'$ is obtained by deleting $e_0$ from
$A$ and by adding a transition $(p[e_0], i[e], n[e])$ for every transition
$e \in E[n[e_0]]$. It is clear that $A$ and $A'$ are equivalent and that there is a
label-preserving bijection between the paths in $A$ and $A'$. Thus, (a) $A$ satisfies
(IDA) (resp. (EDA), (IDA$_d$)) iff $A'$ satisfies (IDA) (resp. (EDA), (IDA$_d$)) and
(b) for all $x \in \Sigma^*$, $\mathrm{da}(A, x) = \mathrm{da}(A', x)$. By induction,
Proposition 1 holds for $A'$ and thus, it follows from (a) and (b) that Proposition 1
also holds for $A$. □

These characterizations have been used in [10, 12] to design algorithms for testing
infinite, polynomial, and exponential ambiguity, and for computing the degree of
polynomial ambiguity in the ε-free case.

Theorem 1 ([10, 12]). Let $A$ be a trim ε-free finite automaton.

1. It is decidable in time $O(|A|_E^3)$ whether $A$ is infinitely ambiguous.

2. It is decidable in time $O(|A|_E^2)$ whether $A$ is exponentially ambiguous.

3. The degree of polynomial ambiguity of $A$, $\mathrm{dpa}(A)$, can be computed in
$O(|A|_E^3)$.

The first result of Theorem 1 has also been generalized by [10] to the case of
automata with ε-transitions, but with a significantly worse complexity.

Theorem 2 ([10]). Let $A$ be a trim ε-cycle free finite automaton. It is decidable in
time $O((|A|_E + |A|_Q^2)^3)$ whether $A$ is infinitely ambiguous.

The main idea used in [10] is to define from $A$ an ε-free automaton $A'$ such that
$A$ is infinitely ambiguous iff $A'$ is infinitely ambiguous. However, the number of
transitions of $A'$ is $|A|_E + |A|_Q^2$. This explains why the complexity in the
ε-transition case is significantly worse than in the ε-free case. A similar approach
can be used straightforwardly to test the exponential ambiguity of $A$ with complexity
$O((|A|_E + |A|_Q^2)^2)$ and to compute $\mathrm{dpa}(A)$ when $A$ is polynomially
ambiguous with complexity $O((|A|_E + |A|_Q^2)^3)$.

Note that we give here tighter estimates of the complexity of the algorithms of
[10, 12], where the authors gave complexities using the loose inequality
$|A|_E \le |\Sigma| \cdot |A|_Q^2$.

4 Algorithms 

Our algorithms for testing ambiguity are based on a general algorithm for the
composition or intersection of automata, which we describe in the following section
both to be self-contained and to give a proof of the correctness of the ε-filter,
which we have not presented in earlier publications.

Fig. 2. Example of finite automaton intersection. (a) Finite automata $A_1$ and
(b) $A_2$. (c) Result of the intersection of $A_1$ and $A_2$.

4.1 Intersection of finite automata 

The intersection of finite automata is a special case of the general composition
algorithm for weighted transducers [8, 7]. States in the intersection $A_1 \cap A_2$
of two finite automata $A_1$ and $A_2$ are identified with pairs of a state of $A_1$
and a state of $A_2$. Leaving aside ε-transitions, the following rule specifies how to
compute a transition of $A_1 \cap A_2$ from appropriate transitions of $A_1$ and
$A_2$:

$$(q_1, a, q_1') \text{ and } (q_2, a, q_2') \Longrightarrow ((q_1, q_2), a, (q_1', q_2')). \tag{1}$$

Figure 2 illustrates the algorithm. A state $(q_1, q_2)$ is initial (resp. final) when
$q_1$ and $q_2$ are initial (resp. final). In the worst case, all transitions of $A_1$
leaving a state $q_1$ match all those of $A_2$ leaving state $q_2$, thus the space and
time complexity of composition is quadratic: $O(|A_1| |A_2|)$, or
$O(|A_1|_E |A_2|_E)$ when $A_1$ and $A_2$ are trim.
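The ε-free intersection just described, which pairs transitions carrying the same
label and explores only the pairs of states reachable from the initial pairs, can be
sketched as follows. The tuple representation `(transitions, initial, final)` and the
function name `intersect` are assumptions made for this example; ε-transitions are
deliberately not handled here.

```python
from collections import defaultdict, deque


def intersect(a1, a2):
    """Intersect two epsilon-free automata given as (transitions, initial, final)."""
    (e1, i1, f1), (e2, i2, f2) = a1, a2
    out1, out2 = defaultdict(list), defaultdict(list)
    for src, label, dst in e1:
        out1[src].append((label, dst))
    for src, label, dst in e2:
        out2[src].append((label, dst))

    initial = {(q1, q2) for q1 in i1 for q2 in i2}
    queue, seen = deque(initial), set(initial)
    transitions = []
    while queue:
        q1, q2 = queue.popleft()
        # Match every pair of outgoing transitions that share a label.
        for a, r1 in out1[q1]:
            for b, r2 in out2[q2]:
                if a == b:
                    transitions.append(((q1, q2), a, (r1, r2)))
                    if (r1, r2) not in seen:
                        seen.add((r1, r2))
                        queue.append((r1, r2))
    final = {(q1, q2) for (q1, q2) in seen if q1 in f1 and q2 in f2}
    return transitions, initial, final
```

The nested loop over the outgoing transitions of a pair of states is where the
quadratic worst case arises.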

Epsilon filtering. A straightforward generalization of the ε-free case would generate
redundant ε-paths. This is a crucial issue in the more general case of the
intersection of weighted automata over a non-idempotent semiring, since it would lead
to an incorrect result. The weight of two matching ε-paths of the original automata
would then be counted as many times as the number of redundant ε-paths generated in
the result, instead of once. It is also a crucial problem in the unweighted case that
we are considering, since redundant ε-paths can affect the test of infinite ambiguity,
as we shall see in the next section. A critical component of the composition algorithm
of [8, 7] consists precisely of coping with this problem using a method called epsilon
filtering.

Figure 3(c) illustrates the problem just mentioned. To match ε-paths leaving $q_1$
and those leaving $q_2$, a generalization of the ε-free intersection can make the
following moves: (1) first move forward on an ε-transition of $q_1$, or even an
ε-path, and stay at the same state $q_2$ in $A_2$, with the hope of later finding a
transition whose label is some label $a \ne \varepsilon$ matching a transition of
$q_2$ with the same label; (2) proceed similarly by following an ε-transition or
ε-path leaving $q_2$ while staying at the same state $q_1$ in $A_1$; or, (3) match an
ε-transition of $q_1$ with an ε-transition of $q_2$.

Let us rename existing ε-labels of $A_1$ as $\varepsilon_2$, and existing ε-labels of
$A_2$ as $\varepsilon_1$, and let us augment $A_1$ with a self-loop labeled with
$\varepsilon_1$ at all states and similarly, augment $A_2$ with a self-loop labeled
with $\varepsilon_2$ at all states, as illustrated by Figures 3(a) and (b). These
self-loops correspond to staying at the same state in that machine while consuming an
ε-label of the other machine. The three moves just described now correspond to the
matches (1) $(\varepsilon_2 : \varepsilon_2)$, (2) $(\varepsilon_1 : \varepsilon_1)$,
and (3) $(\varepsilon_2 : \varepsilon_1)$. The grid of Figure 3(c) shows all the
possible ε-paths between intersection states. We will denote by $\tilde{A}_1$ and
$\tilde{A}_2$ the automata obtained after application of these changes.

Fig. 3. Marking of automata, redundant paths and filter. (a) $\tilde{A}_1$: self-loop
labeled with $\varepsilon_1$ added at all states of $A_1$, regular εs renamed to
$\varepsilon_2$. (b) $\tilde{A}_2$: self-loop labeled with $\varepsilon_2$ added at
all states of $A_2$, regular εs renamed to $\varepsilon_1$. (c) Redundant ε-paths: a
straightforward generalization of the ε-free case could generate all the paths from
(0, 0) to (2, 2) for example, even when composing just two simple transducers.
(d) Filter transducer $M$ allowing a unique ε-path.

For the result of intersection not to be redundant, between any two of these states,
all but one path must be disallowed. There are many possible ways of selecting that
path. One natural way is to select the shortest path with the diagonal transitions
(ε-matching transitions) taken first. Figure 3(c) illustrates in boldface the path
just described from state (0, 0) to state (1, 2). Remarkably, this filtering mechanism
itself can be encoded as a finite-state transducer such as the transducer $M$ of
Figure 3(d). We write $(p, q) \preceq (r, s)$ to indicate that $(r, s)$ can be reached
from $(p, q)$ in the grid.

Proposition 2. Let $M$ be the transducer of Figure 3(d). $M$ allows a unique path
between any two states $(p, q)$ and $(r, s)$, with $(p, q) \preceq (r, s)$.

Proof. Let $a$ denote $(\varepsilon_1 : \varepsilon_1)$, $b$ denote
$(\varepsilon_2 : \varepsilon_2)$, $c$ denote $(\varepsilon_2 : \varepsilon_1)$, and
let $x$ stand for any $(x : x)$, with $x \in \Sigma$. The following sequences must be
disallowed by a shortest-path filter with matching transitions first: $ab$, $ba$,
$ac$, $bc$. This is because, from any state, instead of the moves $ab$ or $ba$, the
matching or diagonal transition $c$ can be taken. Similarly, instead of $ac$ or $bc$,
$ca$ and $cb$ can be taken for an earlier match. Conversely, it is clear from the grid
or an immediate recursion that a filter disallowing these sequences accepts a unique
path between two connected states of the grid.

Let $L$ be the set of sequences over $\sigma = \{a, b, c, x\}$ that contain one of the
disallowed sequences just mentioned as a substring, that is
$L = \sigma^*(ab + ba + ac + bc)\sigma^*$. Then the complement $\overline{L}$
represents exactly the set of paths allowed by that filter and is thus a regular
language. Let $A$ be an automaton representing $L$ (Figure 4(a)). An automaton
representing $\overline{L}$ can be constructed from $A$ by determinization and
complementation (Figures 4(a)-(c)). The resulting automaton $C$ is equivalent to the
transducer $M$ after removal of the state 3, which does not admit a path to a final
state. □

Fig. 4. (a) Finite automaton $A$ representing the set of disallowed sequences.
(b) Automaton $B$, result of the determinization of $A$. Subsets are indicated at each
state. (c) Automaton $C$ obtained from $B$ by complementation; state 3 is not
coaccessible.
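The uniqueness claim of Proposition 2 can be checked by brute force on small grids.
The sketch below enumerates all sequences of moves $a$, $b$, $c$ from (0, 0) to
$(r, s)$ that avoid the four disallowed factors and confirms that exactly one
survives. The coordinate convention (first component is the state of $A_1$, so that
$b$ advances $A_1$ and $a$ advances $A_2$) and the names `FORBIDDEN`, `STEP` and
`allowed_paths` are assumptions made for this illustration; the code does not build
the transducer $M$ itself.

```python
FORBIDDEN = ("ab", "ba", "ac", "bc")

# Effect of each move on the grid position (state of A1, state of A2):
# a = (eps1:eps1) advances A2 only, b = (eps2:eps2) advances A1 only,
# c = (eps2:eps1) advances both (diagonal move).
STEP = {"a": (0, 1), "b": (1, 0), "c": (1, 1)}


def allowed_paths(r, s):
    """List the move sequences from (0, 0) to (r, s) with no forbidden factor."""
    results = []

    def extend(i, j, word):
        if (i, j) == (r, s):
            results.append(word)
            return
        for move, (di, dj) in STEP.items():
            inside = i + di <= r and j + dj <= s
            # word[-1:] is "" for the empty word, so the first move is free.
            if inside and word[-1:] + move not in FORBIDDEN:
                extend(i + di, j + dj, word + move)

    extend(0, 0, "")
    return results
```

Under these conventions, the only surviving sequence to (1, 2) is the diagonal move
followed by one move in $A_2$, which matches the "diagonal first" selection described
in the text.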

Thus, to intersect two finite automata $A_1$ and $A_2$ with ε-transitions, it suffices
to compute $\tilde{A}_1 \circ M \circ \tilde{A}_2$, using the ε-free rules of
intersection or composition.

Theorem 3. Let $A_1$ and $A_2$ be two finite automata with ε-transitions. To each pair
$(\pi_1, \pi_2)$ of successful paths in $A_1$ and $A_2$ sharing the same input label
$x \in \Sigma^*$ corresponds a unique successful path $\pi$ in $A_1 \cap A_2$ labeled
by $x$.

Proof. This follows straightforwardly from Proposition 2. □



4.2 Testing for infinite ambiguity 

We start with a test of the exponential ambiguity of $A$. The key is that the (EDA)
property translates into a very simple property for $A^2 = A \cap A$.

Lemma 1. Let $A$ be a trim ε-cycle free finite automaton. $A$ satisfies (EDA) iff
there exists a strongly connected component of $A^2 = A \cap A$ that contains two
states of the form $(p, p)$ and $(q, q')$, where $p$, $q$ and $q'$ are states of $A$
with $q \ne q'$.

Proof. Assume that $A$ satisfies (EDA). There exist a state $p$ and a string $v$ such
that there are two distinct cycles $c_1$ and $c_2$ labeled by $v$ at $p$. Let $e_1$
and $e_2$ be the first edges that differ in $c_1$ and $c_2$. We can then write
$c_1 = \pi e_1 \pi_1$ and $c_2 = \pi e_2 \pi_2$. If $e_1$ and $e_2$ share the same
label, let $\pi_1' = \pi e_1$, $\pi_2' = \pi e_2$, $\pi_1'' = \pi_1$ and
$\pi_2'' = \pi_2$. If $e_1$ and $e_2$ do not share the same label, exactly one of them
must be an ε-transition. By symmetry, we can assume without loss of generality that
$e_1$ is the ε-transition. Let $\pi_1' = \pi e_1$, $\pi_2' = \pi$, $\pi_1'' = \pi_1$
and $\pi_2'' = e_2 \pi_2$. In both cases, let $q = n[\pi_1'] = p[\pi_1'']$ and
$q' = n[\pi_2'] = p[\pi_2'']$. Observe that $q \ne q'$. Since
$i[\pi_1'] = i[\pi_2']$, $\pi_1'$ and $\pi_2'$ are matched by intersection, resulting
in a path in $A^2$ from $(p, p)$ to $(q, q')$. Similarly, since
$i[\pi_1''] = i[\pi_2'']$, $\pi_1''$ and $\pi_2''$ are matched by intersection,
resulting in a path from $(q, q')$ to $(p, p)$. Thus, $(p, p)$ and $(q, q')$ are in
the same strongly connected component of $A^2$.

Conversely, assume that there exist states $p$, $q$ and $q'$ in $A$ such that
$q \ne q'$ and that $(p, p)$ and $(q, q')$ are in the same strongly connected
component of $A^2$. Let $c$ be a cycle at $(p, p)$ going through $(q, q')$; it has
been obtained by matching two cycles $c_1$ and $c_2$. If $c_1$ were equal to $c_2$,
intersection would match these two paths, creating a path $c'$ along which all the
states would be of the form $(r, r)$, and since $A$ is trim this would contradict
Theorem 3. Thus, $c_1$ and $c_2$ are distinct and (EDA) holds. □

Lemma 1 leads to a straightforward algorithm for testing exponential ambiguity.

Theorem 4. Let $A$ be a trim ε-cycle free finite automaton. It is decidable in time
$O(|A|_E^2)$ whether $A$ is exponentially ambiguous.

Proof. The algorithm proceeds as follows. We compute $A^2$ and, using a depth-first
search of $A^2$, trim it and compute its strongly connected components. It follows
from Lemma 1 that $A$ is exponentially ambiguous iff there is a strongly connected
component that contains two states of the form $(p, p)$ and $(q, q')$ with
$q \ne q'$. Finding such a strongly connected component can be done in time linear in
the size of $A^2$, i.e. in $O(|A|_E^2)$ since $A$ and $A^2$ are trim. Thus, the
complexity of the algorithm is in $O(|A|_E^2)$. □
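The test of Theorem 4 can be sketched as follows for the simpler case of a trim,
ε-free automaton, where the product needs no ε-filter. This is an assumption-laden
illustration, not the authors' code: the triple representation, the helper names, and
the choice of Kosaraju's algorithm for the strongly connected components are ours, and
only the pairs reachable from the diagonal states $(p, p)$ are explored, which is
enough to inspect the components containing a diagonal state.

```python
from collections import defaultdict


def strongly_connected_components(nodes, succ):
    """Kosaraju's algorithm (iterative); returns a dict node -> component id."""
    order, visited = [], set()
    for root in nodes:                       # first pass: finishing order
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(succ[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()
    pred = defaultdict(list)
    for u in nodes:
        for v in succ[u]:
            pred[v].append(u)
    comp = {}
    for root in reversed(order):             # second pass: reversed graph
        if root in comp:
            continue
        comp[root] = root
        stack = [root]
        while stack:
            u = stack.pop()
            for v in pred[u]:
                if v not in comp:
                    comp[v] = root
                    stack.append(v)
    return comp


def satisfies_eda(transitions):
    """Test (EDA) on a trim, epsilon-free automaton via the SCCs of A x A."""
    out, states = defaultdict(list), set()
    for src, label, dst in transitions:
        out[src].append((label, dst))
        states.update((src, dst))
    succ = defaultdict(list)
    todo = [(p, p) for p in states]          # explore from the diagonal states
    seen = set(todo)
    while todo:
        q1, q2 = todo.pop()
        for a, r1 in out[q1]:
            for b, r2 in out[q2]:
                if a == b:
                    succ[(q1, q2)].append((r1, r2))
                    if (r1, r2) not in seen:
                        seen.add((r1, r2))
                        todo.append((r1, r2))
    comp = strongly_connected_components(seen, succ)
    diagonal = {comp[(p, p)] for p in states}
    # Lemma 1: some (q, q') with q != q' shares a component with a (p, p).
    return any(q != r and comp[(q, r)] in diagonal for (q, r) in seen)
```

Handling ε-transitions would require building the product through the filter $M$
described in Section 4.1, which this sketch omits.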

Testing the (IDA) property requires finding three paths sharing the same label in $A$.
This can be done in a natural way using the automaton $A^3 = A \cap A \cap A$, as
shown below.

Lemma 2. Let $A$ be a trim ε-cycle free finite automaton. $A$ satisfies (IDA) iff
there exist two distinct states $p$ and $q$ in $A$ with a non-ε path in
$A^3 = A \cap A \cap A$ from state $(p, p, q)$ to state $(p, q, q)$.

Proof. Assume that $A$ satisfies (IDA). Then, there exists a string $v \in \Sigma^*$
with three paths $\pi_1 \in P(p, v, p)$, $\pi_2 \in P(p, v, q)$ and
$\pi_3 \in P(q, v, q)$. Since these three paths share the same label $v$, they are
matched by intersection, resulting in a path $\pi$ in $A^3$ labeled with $v$ from
$(p[\pi_1], p[\pi_2], p[\pi_3]) = (p, p, q)$ to
$(n[\pi_1], n[\pi_2], n[\pi_3]) = (p, q, q)$.

Conversely, if there is a non-ε path $\pi$ from $(p, p, q)$ to $(p, q, q)$ in $A^3$,
it has been obtained by matching three paths $\pi_1$, $\pi_2$ and $\pi_3$ in $A$ with
the same input $v = i[\pi] \ne \varepsilon$. Thus, (IDA) holds. □
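The condition of Lemma 2 can be checked directly on small ε-free automata by searching
the triple product from $(p, p, q)$ for each pair of distinct states. The sketch below
does exactly that and is intentionally naive: it runs one search per pair, so it does
not achieve the cubic bound obtained with strongly connected components. The
representation and the name `satisfies_ida` are assumptions made for the example.

```python
from collections import defaultdict


def satisfies_ida(transitions):
    """Test (IDA) on a trim, epsilon-free automaton by searching A x A x A."""
    out = defaultdict(lambda: defaultdict(list))
    states = set()
    for src, label, dst in transitions:
        out[src][label].append(dst)
        states.update((src, dst))

    def successors(triple):
        # A triple moves only when all three components read the same label.
        q1, q2, q3 = triple
        for label, dsts1 in out[q1].items():
            for r1 in dsts1:
                for r2 in out[q2].get(label, ()):
                    for r3 in out[q3].get(label, ()):
                        yield (r1, r2, r3)

    for p in states:
        for q in states:
            if p == q:
                continue
            start, target = (p, p, q), (p, q, q)
            todo, seen = [start], {start}
            while todo:
                for nxt in successors(todo.pop()):
                    if nxt == target:    # non-empty path, hence non-epsilon here
                        return True
                    if nxt not in seen:
                        seen.add(nxt)
                        todo.append(nxt)
    return False
```

Since the automaton is assumed ε-free and the start and target triples differ, any
path found is automatically a non-ε path.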

Finally, Theorem 4 and Lemma 2 can be combined to yield the following result.

Theorem 5. Let $A$ be a trim ε-cycle free finite automaton. It is decidable in time
$O(|A|_E^3)$ whether $A$ is finitely, polynomially, or exponentially ambiguous.

Proof. First, Theorem 4 can be used to test whether $A$ is exponentially ambiguous by
computing $A^2$. The complexity of this step is $O(|A|_E^2)$.

If $A$ is not exponentially ambiguous, we proceed by computing and trimming $A^3$ and
then testing whether $A^3$ verifies the property described in Lemma 2. This is done by
considering the automaton $B$ on the alphabet $\Sigma' = \Sigma \cup \{\#\}$ obtained
from $A^3$ by adding a transition labeled by $\#$ from state $(p, q, q)$ to state
$(p, p, q)$ for every pair $(p, q)$ of states in $A$ such that $p \ne q$. It follows
that $A^3$ verifies the condition in Lemma 2 iff there is a cycle in $B$ containing
both a transition labeled by $\#$ and a transition labeled by a symbol in $\Sigma$.
This property can be checked straightforwardly using a depth-first search of $B$ to
compute its strongly connected components. If a strongly connected component of $B$ is
found that contains both a transition labeled with $\#$ and a transition labeled by a
symbol in $\Sigma$, $A$ verifies (IDA) but not (EDA) and thus $A$ is polynomially
ambiguous. Otherwise, $A$ is finitely ambiguous. The complexity of this step is linear
in the size of $B$: $O(|B|_E) = O(|A|_E^3 + |A|_Q^2) = O(|A|_E^3)$ since $A$ and $B$
are trim. The total complexity of the algorithm is
$O(|A|_E^2 + |A|_E^3) = O(|A|_E^3)$. □

When $A$ is polynomially ambiguous, we can derive from the algorithm just described
one that computes $\mathrm{dpa}(A)$.

Theorem 6. Let $A$ be a trim ε-cycle free finite automaton. If $A$ is polynomially
ambiguous, $\mathrm{dpa}(A)$ can be computed in time $O(|A|_E^3)$.

Proof. We first compute $A^3$ and use the algorithm of Theorem 5 to test whether $A$
is polynomially ambiguous and to compute all the pairs $(p, q)$ that verify the
condition of Lemma 2. This step has complexity $O(|A|_E^3)$.

We then compute the component graph $G$ of $A$, and for each pair $(p, q)$ found in
the previous step, we add a transition labeled with $\#$ from the strongly connected
component of $p$ to the one of $q$. If there is a path in that graph containing $d$
edges labeled by $\#$, then $A$ verifies (IDA$_d$). Thus, $\mathrm{dpa}(A)$ is the
maximum number of edges marked by $\#$ that can be found along a path in $G$. Since
$G$ is acyclic, this number can be computed in linear time in the size of $G$, i.e. in
$O(|A|_Q^2)$. Thus, the overall complexity of the algorithm is $O(|A|_E^3)$. □
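The last step of this proof, finding the maximum number of $\#$-marked edges along a
path of an acyclic graph, can be sketched with a topological traversal. The sketch
assumes the component graph has already been built and is given as a node count and a
list of `(u, v, marked)` edges; these names and the function `max_marked_edges` are
hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def max_marked_edges(num_nodes, edges):
    """Maximum number of marked edges on a path of a DAG on nodes 0..num_nodes-1.

    edges: iterable of (u, v, marked) triples, marked being True for #-edges.
    """
    succ = defaultdict(list)
    indegree = [0] * num_nodes
    for u, v, marked in edges:
        succ[u].append((v, 1 if marked else 0))
        indegree[v] += 1
    best = [0] * num_nodes   # best[v]: max marked edges on a path ending at v
    ready = [v for v in range(num_nodes) if indegree[v] == 0]
    processed = 0
    while ready:             # Kahn-style topological traversal
        u = ready.pop()
        processed += 1
        for v, gain in succ[u]:
            best[v] = max(best[v], best[u] + gain)
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if processed != num_nodes:
        raise ValueError("graph has a cycle")
    return max(best, default=0)
```

For a two-component graph with one ordinary edge and one $\#$-edge from the first
component to the second, the function returns 1, which corresponds to linear growth
of the ambiguity.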

5 Application to the Approximation of Entropy 

In this section, we describe an application in which determining the degree of ambigu- 
ity of a probabilistic automaton helps estimate the quality of an approximation of its 
entropy. 

Weighted automata are automata in which each transition carries some weight in 
addition to the usual alphabet symbol. The weights are elements of a semiring, that is a 
ring that may lack negation. The following is a more formal definition. 

Definition 3. A weighted automaton $A$ over a semiring
$(\mathbb{K}, \oplus, \otimes, \overline{0}, \overline{1})$ is a 7-tuple
$(\Sigma, Q, I, F, E, \lambda, \rho)$ where: $\Sigma$ is the finite alphabet of the
automaton, $Q$ is a finite set of states, $I \subseteq Q$ the set of initial states,
$F \subseteq Q$ the set of final states,
$E \subseteq Q \times (\Sigma \cup \{\varepsilon\}) \times \mathbb{K} \times Q$ a
finite set of transitions, $\lambda : I \to \mathbb{K}$ the initial weight function
mapping $I$ to $\mathbb{K}$, and $\rho : F \to \mathbb{K}$ the final weight function
mapping $F$ to $\mathbb{K}$.

Given a transition $e \in E$, we denote by $w[e]$ its weight. We extend the weight
function $w$ to paths by defining the weight of a path as the $\otimes$-product of the
weights of its constituent transitions:
$w[\pi] = w[e_1] \otimes \cdots \otimes w[e_k]$. The weight associated by a weighted
automaton $A$ to an input string $x \in \Sigma^*$ is defined by:

$$[\![A]\!](x) = \bigoplus_{\pi \in P(I, x, F)} \lambda[p[\pi]] \otimes w[\pi] \otimes \rho[n[\pi]]. \tag{2}$$



The entropy $H(A)$ of a probabilistic automaton $A$ is defined as:

$$H(A) = -\sum_{x \in \Sigma^*} [\![A]\!](x) \log([\![A]\!](x)). \tag{3}$$

Let $\mathbb{K}$ denote
$(\mathbb{R} \cup \{+\infty, -\infty\}) \times (\mathbb{R} \cup \{+\infty, -\infty\})$.
The system $(\mathbb{K}, \oplus, \otimes, (0, 0), (1, 0))$, where $\oplus$ and
$\otimes$ are defined as follows, defines a commutative semiring called the entropy
semiring [2]. For any two pairs $(x_1, y_1)$ and $(x_2, y_2)$ in $\mathbb{K}$,

$$(x_1, y_1) \oplus (x_2, y_2) = (x_1 + x_2, \; y_1 + y_2) \tag{4}$$
$$(x_1, y_1) \otimes (x_2, y_2) = (x_1 x_2, \; x_1 y_2 + x_2 y_1). \tag{5}$$

In [2], the authors show that a generalized shortest-distance algorithm over this
semiring correctly computes the entropy of an unambiguous probabilistic automaton $A$.
The algorithm starts by mapping the weight of each transition to a pair where the
first element is the probability and the second the entropy:
$w[e] \mapsto (w[e], -w[e] \log w[e])$. The algorithm then proceeds by computing the
generalized shortest-distance under the entropy semiring, which computes the
$\oplus$-sum of the weights of all accepting paths in $A$.
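The two operations of the entropy semiring and the initial weight mapping can be
illustrated with a few lines of code. The sketch below is not the generalized
shortest-distance algorithm: it simply enumerates a given finite list of accepting
paths, each described by the probabilities of its transitions, and accumulates their
weights. The function names `splus`, `stimes`, `lift` and `path_sum` are assumptions
introduced for the example.

```python
import math


def splus(u, v):
    """Entropy-semiring sum: componentwise addition."""
    return (u[0] + v[0], u[1] + v[1])


def stimes(u, v):
    """Entropy-semiring product: (x1 x2, x1 y2 + x2 y1)."""
    return (u[0] * v[0], u[0] * v[1] + v[0] * u[1])


def lift(p):
    """Map a probability p in (0, 1] to the pair (p, -p log p)."""
    return (p, -p * math.log(p))


def path_sum(paths):
    """Sum over paths of the product of the lifted transition probabilities.

    paths: list of lists of transition probabilities, one list per path.
    """
    total = (0.0, 0.0)                  # semiring zero
    for probs in paths:
        weight = (1.0, 0.0)             # semiring one
        for p in probs:
            weight = stimes(weight, lift(p))
        total = splus(total, weight)
    return total
```

The product rule is what makes the second component of a path weight equal to
$-w[\pi] \log w[\pi]$: for two probabilities $p_1$ and $p_2$,
$p_1(-p_2 \log p_2) + p_2(-p_1 \log p_1) = -p_1 p_2 \log(p_1 p_2)$.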

In this section, we show that the same shortest-distance algorithm yields an
approximation of the entropy of an ambiguous probabilistic automaton $A$, where the
approximation quality is a function of the degree of polynomial ambiguity,
$\mathrm{dpa}(A)$. Our proofs make use of the standard log-sum inequality [3], a
special case of Jensen's inequality, which holds for any positive reals
$a_1, \ldots, a_k$ and $b_1, \ldots, b_k$:

$$\sum_{i=1}^{k} a_i \log \frac{a_i}{b_i} \ge \Big(\sum_{i=1}^{k} a_i\Big) \log \frac{\sum_{i=1}^{k} a_i}{\sum_{i=1}^{k} b_i}. \tag{6}$$

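Lemma 3. Let $A$ be a probabilistic automaton and let $x \in \Sigma^+$ be a string
accepted by $A$ on $k$ paths $\pi_1, \ldots, \pi_k$. Let $w(\pi_i)$ be the probability
of path $\pi_i$. Clearly, $[\![A]\!](x) = \sum_{i=1}^{k} w(\pi_i)$. Then,

$$\sum_{i=1}^{k} w(\pi_i) \log w(\pi_i) \ge [\![A]\!](x) \big(\log [\![A]\!](x) - \log k\big). \tag{7}$$

Proof. The result follows straightforwardly from the log-sum inequality, with
$a_i = w(\pi_i)$ and $b_i = 1$:

$$\sum_{i=1}^{k} w(\pi_i) \log w(\pi_i) \ge \Big(\sum_{i=1}^{k} w(\pi_i)\Big) \log \frac{\sum_{i=1}^{k} w(\pi_i)}{k} = [\![A]\!](x) \big(\log [\![A]\!](x) - \log k\big). \tag{8}$$

□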

For a probabilistic automaton A, let S(A) be the quantity computed by the generalized 
shortest-distance algorithm with the entropy semiring. For an unambiguous automaton 
A, S(A) = H(A) [2]. 



Theorem 7. Let $A$ be a probabilistic automaton and let $L$ denote the expected length
of strings accepted by $A$ (i.e. $L = \sum_{x \in \Sigma^*} |x| \, [\![A]\!](x)$).
Then,

1. If $A$ is finitely ambiguous with degree of ambiguity $k$ (i.e.
$\mathrm{da}(A) = k$ for some $k \in \mathbb{N}$), then
$H(A) \le S(A) \le H(A) + \log k$.

2. If $A$ is polynomially ambiguous with degree of polynomial ambiguity $k$ (i.e.
$\mathrm{dpa}(A) = k$ for some $k \in \mathbb{N}$), then
$H(A) \le S(A) \le H(A) + k \log L$.

Proof. The lower bound, $S(A) \ge H(A)$, follows from the observation that for a
string $x$ that is accepted in $A$ by $k$ paths $\pi_1, \ldots, \pi_k$,

$$\sum_{i=1}^{k} w(\pi_i) \log(w(\pi_i)) \le \Big(\sum_{i=1}^{k} w(\pi_i)\Big) \log\Big(\sum_{i=1}^{k} w(\pi_i)\Big). \tag{9}$$

Since the quantity $-\sum_{i=1}^{k} w(\pi_i) \log(w(\pi_i))$ is string $x$'s
contribution to $S(A)$ and the quantity
$-(\sum_{i=1}^{k} w(\pi_i)) \log(\sum_{i=1}^{k} w(\pi_i))$ its contribution to $H(A)$,
summing over all accepted strings $x$, we obtain $H(A) \le S(A)$.

Assume that $A$ is finitely ambiguous with degree of ambiguity $k$. Let
$x \in \Sigma^*$ be a string that is accepted on $l_x \le k$ paths
$\pi_1, \ldots, \pi_{l_x}$. By Lemma 3,

$$\sum_{i=1}^{l_x} w(\pi_i) \log w(\pi_i) \ge [\![A]\!](x) \big(\log [\![A]\!](x) - \log l_x\big) \ge [\![A]\!](x) \big(\log [\![A]\!](x) - \log k\big). \tag{10}$$

Thus,

$$S(A) = -\sum_{x \in \Sigma^*} \sum_{i=1}^{l_x} w(\pi_i) \log w(\pi_i) \le H(A) + \sum_{x \in \Sigma^*} (\log k) \, [\![A]\!](x) = H(A) + \log k. \tag{11}$$

This proves the first statement of the theorem.

Next, assume that $A$ is polynomially ambiguous with degree of polynomial ambiguity
$k$. By Lemma 3,

$$\sum_{i=1}^{l_x} w(\pi_i) \log w(\pi_i) \ge [\![A]\!](x) \big(\log [\![A]\!](x) - \log l_x\big) \ge [\![A]\!](x) \big(\log [\![A]\!](x) - \log(|x|^k)\big). \tag{12}$$

Thus,

$$S(A) \le H(A) + \sum_{x \in \Sigma^*} k \, [\![A]\!](x) \log |x| = H(A) + k \, \mathrm{E}_A[\log |x|] \tag{13}$$
$$\le H(A) + k \log \mathrm{E}_A[|x|] = H(A) + k \log L \quad \text{(by Jensen's inequality)},$$

which proves the second statement of the theorem. □
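The first statement of Theorem 7 can be checked numerically on a toy distribution. In
the sketch below, a probabilistic automaton is abstracted as a dictionary mapping each
accepted string to the list of probabilities of its accepting paths; this abstraction,
the function name `entropy_and_path_sum`, and the example values are assumptions made
purely for illustration.

```python
import math


def entropy_and_path_sum(path_probs_by_string):
    """Return (H, S) for a dict: string -> probabilities of its accepting paths."""
    h = s = 0.0
    for probs in path_probs_by_string.values():
        total = sum(probs)                        # probability of the string
        h -= total * math.log(total)              # contribution to H(A)
        s -= sum(p * math.log(p) for p in probs)  # contribution to S(A)
    return h, s


# Toy example: "a" is accepted on two paths, "b" on a single path.
EXAMPLE = {"a": [0.25, 0.25], "b": [0.5]}
H, S = entropy_and_path_sum(EXAMPLE)
K = max(len(probs) for probs in EXAMPLE.values())  # degree of ambiguity, here 2
```

Here $H = \log 2$ and $S = \tfrac{3}{2} \log 2$, so the gap $S - H$ stays below the
bound $\log k = \log 2$ given by the theorem.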

The quality of the approximation of the entropy of a probabilistic automaton $A$
depends on the expected length $L$ of an accepted string. $L$ can be computed
efficiently for an arbitrary probabilistic automaton using the expectation semiring
and the generalized shortest-distance algorithms, using techniques similar to the ones
described in [2]. The definition of the expectation semiring is identical to that of
the entropy semiring. The only difference is in the initial step, where the weight of
each transition in $A$ is mapped to a pair of elements. Under the expectation
semiring, the mapping is $w[e] \mapsto (w[e], w[e])$.



6 Conclusion 



We presented simple and efficient algorithms for testing the finite, polynomial, or
exponential ambiguity of finite automata with ε-transitions. We conjecture that the
running-time complexity of our algorithms is optimal. These algorithms have a variety
of applications, in particular to test a pre-condition for the applicability of other
automata algorithms. Our application to the approximation of the entropy gives another
illustration of the applications of these algorithms.

Our algorithms also illustrate the prominent role played by the general algorithm for
the intersection or composition of automata and transducers with ε-transitions in the
design of testing algorithms. Composition can be used to devise simple and efficient
testing algorithms. We have shown elsewhere how it can be used to test the
functionality of a finite-state transducer or to test the twins property for weighted
automata and transducers [1].